Fingerprinting hyperglycemia using predictive modelling approach based on low-cost routine CBC and CRP diagnostics

Hyperglycemia is an outcome of dysregulated glucose homeostasis in the human body and may induce chronic elevation of blood glucose levels. Lifestyle factors such as overnutrition, physical inactivity, and psychosocial factors, coupled with systemic low-grade inflammation, have a strong negative impact on glucose homeostasis, in particular insulin sensitivity. Together, these factors contribute to the pathophysiology of diabetes mellitus (DM) and to its expanding prevalence regionally and globally. The rapid rise in the prevalence of type 2 diabetes therefore underscores the need for its early diagnosis and treatment. In this work, we evaluated the discriminatory capacity of different diagnostic markers, including inflammatory biomolecules and red blood cell (RBC) indices, in predicting the risk of hyperglycemia and borderline hyperglycemia. To that end, 208,137 clinical diagnostic entries obtained over five years from Chughtai Labs, Pakistan, were retrospectively evaluated. The dataset included HbA1c (n = 142,011), complete blood count (CBC, n = 84,263), fasting blood glucose (FBG, n = 35,363), and C-reactive protein (CRP, n = 9035) tests. Our results provide four glycemic predictive models for two cohorts (HbA1c and FBG), each having an overall predictive accuracy of more than 80% (p-value < 0.0001). Next, multivariate analysis (MANOVA) followed by univariate analysis (ANOVA) was employed to identify predictors with significant discriminatory capacity for different levels of glycemia. We show that the interplay between inflammation, hyperglycemia-induced derangements in RBC indices, and altered glucose homeostasis could be employed for prognosticating hyperglycemic outcomes. Our results yield a glycemic predictor with high sensitivity and specificity that employs inflammatory markers coupled with RBC indices to predict glycemic outcomes (ROC p-value < 0.0001). Taken together, this study outlines a predictor of glycemic outcomes which could assist as a prophylactic intervention in predicting the early onset of hyperglycemia and borderline hyperglycemia.


Amna Tahir 1, Kashif Asghar 2, Waqas Shafiq 3, Hijab Batool 4, Dilawar Khan 4, Omar Chughtai 4 & Safee Ullah Chaudhary 1*
Hyperglycemia is an outcome of altered glucose homeostasis due to impaired insulin secretion and varying degrees of peripheral insulin resistance 1. Hyperglycemia is a key component in the pathophysiology of diabetes due to glucose dysregulation 2. Multiple factors, including rapid urbanization, aging populations, increasing obesity due to sedentary lifestyles, inflammation, comorbidities, and genetic risk, have put an ever-increasing number of people at risk of developing diabetes mellitus (DM) 2,3. An epidemiological study on the global prevalence of diabetes reported that the prevalence of diabetes for all age groups will reach 4.4% of the human population in 2030 3. Further, the International Diabetes Federation (IDF) reports that approximately 537 million people in the world suffer from diabetes 4, out of which around 88,000 patients end up losing their lives every day 5.
In the specific case of Pakistan, the world's fifth most populous country 6, studies estimating the prevalence of diabetes show an alarming increase in the condition across all segments of the population 7-9. Limited governmental support for monitoring and controlling DM is extracting a hitherto unaccounted cost from the

Study population and selection criteria
HbA1c and FBG, the two most common glycemic indicators, were utilized as grouping variables for splitting the sample into two sub-cohorts for the classification of glycemic status. The criteria defined by the American Diabetes Association (ADA) in "Classification and Diagnosis of Diabetes: Standards of Medical Care in Diabetes-2022" 14 were used to classify three levels of glycemia. An individual was classified to have (1) hyperglycemia, if they fulfilled any one of the following criteria: fasting blood glucose (FBG) > 126 mg/dl and HbA1c > 6.5%, (2) borderline

Model evaluation and ROC analysis
Two validation techniques were applied to assess the accuracy and robustness of the established models: (i) the back-substitution method, in which the classification predicted by the discriminant function was compared with the actual classification to calculate the correct discrimination proportion of the classification function, and (ii) jackknife (leave-one-out) cross-validation. For model evaluation, ROC analysis was performed; a ROC curve shows the trade-off between the true positive fraction (TPF) and the false positive fraction (FPF), hence providing a measure of the sensitivity, specificity, and validity of our glycemic state predictor. The area under the curve (AUC) was measured with a 95% confidence interval and significance value.
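The difference between the two validation schemes can be illustrated with a minimal sketch. It is not the authors' pipeline: a nearest-centroid rule stands in for the discriminant functions so that the example needs only the standard library, and the toy readings (HbA1c in %, WBC in 10^9/L) and all function names are hypothetical.

```python
from statistics import mean

def centroids(rows, labels):
    """Return {label: mean feature vector} computed from the training rows."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return {label: [mean(col) for col in zip(*members)]
            for label, members in groups.items()}

def predict(model, row):
    """Assign a row to the class whose centroid is nearest (squared Euclidean)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(model, key=lambda label: dist(model[label]))

def resubstitution_accuracy(rows, labels):
    """Back-substitution: fit on all cases, then re-classify the same cases."""
    model = centroids(rows, labels)
    hits = sum(predict(model, r) == y for r, y in zip(rows, labels))
    return hits / len(rows)

def jackknife_accuracy(rows, labels):
    """Leave-one-out: each case is classified by a model fitted without it."""
    hits = 0
    for i in range(len(rows)):
        model = centroids(rows[:i] + rows[i + 1:], labels[:i] + labels[i + 1:])
        hits += predict(model, rows[i]) == labels[i]
    return hits / len(rows)

# Hypothetical toy data, not taken from the study: [HbA1c %, WBC 10^9/L]
rows = [[5.0, 6.0], [5.2, 6.5], [5.6, 7.0],
        [5.9, 7.2], [6.0, 7.5], [6.2, 7.8],
        [7.5, 8.5], [8.0, 9.0], [9.0, 9.5]]
labels = ["N"] * 3 + ["BH"] * 3 + ["H"] * 3

print(resubstitution_accuracy(rows, labels), jackknife_accuracy(rows, labels))
```

Because back-substitution scores the model on the cases it was fitted to, it tends to be optimistic; a jackknife estimate close to it, as reported for the models here, suggests the discriminant functions are stable.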

Inflammatory markers and RBC status exhibit concomitance with glycemic variations
Data on clinical diagnostics with a total sample size of n = 208,137 was obtained. Two sub-cohorts were derived according to HbA1c (n = 142,011) and fasting blood glucose (FBG, n = 35,362) tests. Means (± SD) for 12 different clinical parameters were measured for the glycemic states categorized under HbA1c (Table 1A) and FBG (Table 1B).
Out of the 142,011 individuals who were tested for HbA1c, 60.94% had hyperglycemia (H), while 21.18% and 17.85% had borderline hyperglycemia (BH) and normoglycemia (N), respectively, using American Diabetes Association (ADA) thresholds 14. Higher rates of hyperglycemia (53.45% vs 46.52%) and borderline hyperglycemia (54.81% vs 45.16%) were observed amongst males as compared to females (Table 1A), with no contrasting differences in the mean ages. Of the 35,362 individuals who were tested for FBG, 38.62% had hyperglycemia, 30.10% were borderline, and 31.28% had normoglycemia. Males had higher rates of hyperglycemia (58.47%) as compared to females (41.51%). A similar trend was observed amongst males and females for borderline hyperglycemia (60.45% vs 39.53%), with similar mean ages, in the FBG-classified sub-group (Table 1B). For both sub-groups (HbA1c and FBG), heightened mean levels of CRP, WBC, Platelet, NLR, PLR, RBC, and HCT were observed in the hyperglycemic state, whereas a negative trend was seen for RBC indices including MCV, MCH, and MCHC. Hb exhibited stable mean levels across all levels of glycemia (Table 1).
Together, our results show that with increasing glycemia, CRP, WBC, Platelet, NLR, PLR, RBC, and HCT show an increasing trend, whereas RBC indices tend to decrease.

CRP, WBC, NLR, and PLR are significant discriminants for differentiating glycemic control
To evaluate the inflammatory response present in different glycemic conditions, we performed a multivariate analysis of variance (MANOVA) separately for the two sub-groups, HbA1c (n = 2877) and FBG (n = 616). Pillai's trace showed that the five inflammatory markers (CRP, WBC, Platelet, NLR, PLR) vary significantly across glycemic states (normal, borderline, and hyperglycemia), with a group-effect F-ratio of 182.27 for the HbA1c cohort (p < 0.0001) and 37.249 for the FBG cohort (p < 0.0001).
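For reference, the standard textbook definition of Pillai's trace and its usual F approximation are given below; the notation is ours and is not taken from the paper. H and E denote the between-group (hypothesis) and within-group (error) sums-of-squares-and-cross-products matrices, the λ_i are the eigenvalues of E^{-1}H, p is the number of dependent variables, g the number of groups, and N the number of cases.

```latex
V \;=\; \operatorname{tr}\!\left[ H (H + E)^{-1} \right]
  \;=\; \sum_{i=1}^{s} \frac{\lambda_i}{1 + \lambda_i},
\qquad s = \min(p,\; g - 1)

F \;\approx\; \frac{2n + s + 1}{2m + s + 1} \cdot \frac{V}{s - V},
\qquad
m = \frac{|p - (g - 1)| - 1}{2},
\quad
n = \frac{(N - g) - p - 1}{2}
```

The F statistic is referred to an F distribution with s(2m + s + 1) and s(2n + s + 1) degrees of freedom. With three glycemic states (g = 3) and five inflammatory markers (p = 5), s = 2, so at most two eigenvalues contribute to V.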

A predictive model of non-specific inflammatory markers for estimation of dysglycemia
Linear discriminant analysis (LDA) was used to develop a glycemic prediction model comprising six clinical parameters: five inflammatory markers (CRP, WBC, Platelet, NLR, PLR), of which CRP is an indicator of chronic inflammation, and one glycemic indicator (HbA1c or FBG), to differentiate between different states of glycemia (hyperglycemia, borderline hyperglycemia, and normoglycemia). Results are displayed in Fig. 3A, B for the HbA1c and FBG cohorts, respectively. Using the variances from all the values, two discriminant functions were derived, which together accounted for 100% of the variance. For both the HbA1c and FBG cohorts, the first canonical discriminant function accounted for more than 99% of the total variance in the dataset, with a canonical correlation of 0.7, at a significance value of p < 0.001 in both cases.
The classification discriminant functions (DF0, DF1, and DF2) were therefore generated based on the estimation of the corresponding β values (Table 3A for the HbA1c cohort and Table 3B for the FBG cohort).
We then applied LDA to develop a glycemic predictive model with enhanced accuracy by integrating the significant DVs identified by the individual ANOVAs (Table 2C, D). Predictors of inflammation (NLR, PLR, and WBC count), predictors of erythrocyte status (RBC count, MCV, and MCH), and a glycemic status indicator (HbA1c or FBG) were used to differentiate between different states of glycemia. Results are displayed in Fig. 4A, B for the HbA1c and FBG cohorts, respectively. Two discriminant functions were derived, which together accounted for 100% of the variance. For the HbA1c cohort (n = 50,116), the first canonical discriminant function accounted for 99.9% of the total variance in the dataset, with a canonical correlation of 0.729; for the FBG cohort (n = 13,861), the first canonical discriminant function accounted for 100% of the variance, with a canonical correlation of 0.739, at a significance p-value < 0.001 in both cases. The classification discriminant functions (DF0, DF1, and DF2) were therefore generated based on the estimation of the corresponding β values (Table 3C for the HbA1c cohort and Table 3D for the FBG cohort).
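How a set of classification functions turns a reading into a predicted glycemic state can be sketched as follows: each function is a constant plus a β-weighted sum of the predictors, and the case is assigned to the state whose function scores highest. The coefficients below are hypothetical placeholders chosen for illustration only (they are not the values in Table 3), only three predictors are used, and labelling the functions by glycemic state rather than as DF0-DF2 is our assumption.

```python
# Hypothetical classification-function coefficients; NOT the values in Table 3.
CLASSIFICATION_FUNCTIONS = {
    "normoglycemia":            {"const": -20.0, "HbA1c": 7.0,  "WBC": 0.50, "NLR": 0.30},
    "borderline hyperglycemia": {"const": -27.0, "HbA1c": 8.2,  "WBC": 0.50, "NLR": 0.30},
    "hyperglycemia":            {"const": -40.0, "HbA1c": 10.2, "WBC": 0.55, "NLR": 0.35},
}

def discriminant_scores(sample, functions=CLASSIFICATION_FUNCTIONS):
    """Evaluate const + sum(beta_i * x_i) for every classification function."""
    scores = {}
    for state, beta in functions.items():
        scores[state] = beta["const"] + sum(
            coef * sample[name] for name, coef in beta.items() if name != "const"
        )
    return scores

def classify(sample, functions=CLASSIFICATION_FUNCTIONS):
    """Assign the sample to the state with the highest classification score."""
    scores = discriminant_scores(sample, functions)
    return max(scores, key=scores.get)

# Hypothetical reading: HbA1c in %, WBC in 10^9/L, NLR dimensionless.
reading = {"HbA1c": 8.0, "WBC": 7.0, "NLR": 2.0}
print(classify(reading))
```

Substituting the published β values and the full predictor set would, under the same argmax rule, yield the kind of in silico screening tool the paper describes.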

An evaluation of model accuracy to predict dysglycemia from inflammatory markers
Discriminant classification results showed good separation of the three glycemic states for both cohorts, with an accuracy of greater than 80% (Fig. 3C, D). Classification results for the HbA1c cohort (Fig. 3C) showed that the back-substitution method classified hyperglycemia with a correct discrimination proportion of 72%, borderline hyperglycemia with 95.2%, and normoglycemia with 89.6%. Moreover, classification results for the FBG cohort (Fig. 3D) showed correct group membership of about 67% for hyperglycemia, 84.6% for borderline hyperglycemia, and 93.4% for normoglycemia. To further evaluate the stability of the model discriminant functions, jackknife cross-validation was employed, which showed nearly identical classification accuracy for both the HbA1c (80.8%) and FBG (80.8%) LDA models. ROC (receiver operating characteristic) curve analysis of the model computed AUC and 95% CI values for each glycemic type for both cohorts. The model exhibited a strong diagnostic value for glycemic state, with all ROCs showing AUC above 0.9 (p < 0.0001, Fig. 5).
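An AUC reported per glycemic type can be read as a one-vs-rest comparison: each state in turn is treated as the positive class and the other two as negative. The sketch below computes such an AUC from the rank (Mann-Whitney) formulation, that is, the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. The one-vs-rest reading is our assumption, and the scores and function names are hypothetical.

```python
def auc(scores, is_positive):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, flag in zip(scores, is_positive) if flag]
    neg = [s for s, flag in zip(scores, is_positive) if not flag]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def one_vs_rest_auc(class_scores, labels):
    """Return {state: AUC} where class_scores[state] holds one score per case."""
    return {state: auc(scores, [y == state for y in labels])
            for state, scores in class_scores.items()}

# Hypothetical scores for the hyperglycemia class on six cases.
labels = ["H", "H", "N", "H", "BH", "N"]
class_scores = {"H": [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]}
print(one_vs_rest_auc(class_scores, labels))
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; for cohorts of the size analysed here a rank-based implementation from a statistics library would be the practical choice.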

An evaluation of joint model accuracy in the prediction of dysglycemia
Upon integration of significant candidates from RBC parameters with the inflammatory markers in the HbA1c model, the overall accuracy increased from 81.2 to 89.5% (Fig. 4C), thus providing evidence of a strong discriminatory value of parameter superimposition. Notably, no appreciable difference was observed for the FBG cohort (82.3% vs 82.8%) (Fig. 4D). Interestingly, the joint HbA1c cohort model showed an improvement of about 25 percentage points in accuracy for hyperglycemia (72% vs 97.1%). The FBG cohort showed an increment of 10 percentage points in the predictive ability for borderline hyperglycemia (84% vs 94%). Jackknife cross-validation results were comparable with the classification accuracy for both the HbA1c (89.5%) and FBG (82.6%) LDA models. Furthermore, ROC assessment established the diagnostic specificity and sensitivity of the joint model (p < 0.0001), with all ROC AUCs above 0.9 except for the HbA1c cohort in predicting borderline hyperglycemia (AUC = 0.87). Results are illustrated in Fig. 6.
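The per-state and overall figures quoted above are both read off a classification (confusion) table, and the overall accuracy is effectively a class-size-weighted average of the per-state proportions. A minimal sketch of that relationship follows; the counts are hypothetical and are not those of Fig. 3 or Fig. 4.

```python
def classification_summary(confusion):
    """Summarise a table where confusion[actual][predicted] is a case count.

    Returns (per_class, overall): the correct discrimination proportion for
    each actual class, and the overall proportion of correctly classified cases.
    """
    per_class = {actual: row[actual] / sum(row.values())
                 for actual, row in confusion.items()}
    total = sum(sum(row.values()) for row in confusion.values())
    correct = sum(row[actual] for actual, row in confusion.items())
    return per_class, correct / total

# Hypothetical counts: the hyperglycemia group is four times larger than the others.
confusion = {
    "H":  {"H": 144, "BH": 40, "N": 16},
    "BH": {"H": 1,   "BH": 48, "N": 1},
    "N":  {"H": 0,   "BH": 5,  "N": 45},
}
per_class, overall = classification_summary(confusion)
print(per_class, overall)
```

In this toy table the largest group has the lowest per-class proportion, so the overall accuracy sits well below the unweighted mean of the three proportions; this is why an improvement concentrated in the largest class can move the overall figure substantially.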

Low cost and high accuracy risk fingerprinting of dysglycemia
The prediction accuracies of the inflammatory ("M1": Eqs. 1-3, "M2": Eqs. 4-6) and joint ("M3": Eqs. 7-9, "M4": Eqs. 10-12) models for both HbA1c and FBG are provided in Table 4. M1 provided the highest discrimination proportion of 95.2% for predicting borderline hyperglycemia but had the highest cost (~ $20). M4 had 94% prediction accuracy for borderline hyperglycemia at the lowest price of just ~ $5. Importantly, M3 had the highest discriminatory power for the correct classification of hyperglycemia (97.1%) and normoglycemia (94.2%) cases. In the case of the FBG cohort, the overall predictive capacity of models M2 and M4 (82.3% vs 82.8%) was comparable, but M2 came at well over twice the price ($12 vs. $5). Overall, amongst all the reported models, M3 provided the highest cumulative predictive accuracy of 89.5% for hyperglycemia, borderline hyperglycemia, and normal cases. In conclusion, the M3 and M4 models could be utilized for population-level screening programs and by clinicians for predicting hyperglycemia and borderline hyperglycemia at a lower cost.

Discussion
In this study, we have investigated the interplay between inflammation, RBC parameters, and hyperglycemia by employing clinical diagnostics of CBC, CRP, HbA1c, and FBG (fasting blood glucose) towards developing a predictive model of glycemic outcomes. Our results show that variations in inflammatory profile, in addition to derangements in RBC indices, can be formulated into a powerful predictive tool for measuring dysglycemia with considerable precision. The proposed approach can be particularly useful in population-level risk fingerprinting of DM and in evaluating patients' health outcomes concerning hyperglycemia. Aberrations in the immune system are central to the incidence and progression of DM 17,18. Modern research has also furnished evidence on the role of inflammation in the onset of pro-inflammatory pathways in insulin production, which then lead to the initiation of metabolic disorders including DM 38. CRP (C-reactive protein) is one such chronic inflammatory marker that has a direct association with the risk of type 2 diabetes 21,38. Consistent with literature reports, the current study observed an increasing trend in CRP in hyperglycemia in comparison to the normal or borderline state (Table 1A, B). We, therefore, propose a glycemic status (normoglycemia, borderline hyperglycemia, hyperglycemia) prediction model that uses glycemic indicators in concert with other routine clinical diagnostics (CBC and CRP). For that, we started by developing a MANOVA model (categorical glycemic status as the independent variable, diagnostic variables as dependents) to evaluate the significance of the multivariate association between different parameters. Follow-up univariate ANOVAs and multiple pairwise comparisons (Fig. 1) also indicate CRP to be significantly different amongst the three levels, for both the HbA1c and FBG cohorts. WBC, a component of CBC, is also a nonspecific indicator of inflammation and is reported to be a predictor in the pathogenesis of diabetes 20,25. Furthermore, NLR and PLR were estimated from diagnostic data and used as inflammatory indices to associate hyperglycemia with inflammation. NLR and PLR are well reported to have predictive power for DM 27-29, with studies reporting NLR and PLR values to be higher in DM 39,40. The results of the current study are in agreement (Table 2), showing the levels of WBCs, NLR, and PLR to be significantly different among the three glycemic groups and significantly raised from normoglycemia to hyperglycemia (Figs. 1 and 2).
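NLR and PLR are simple ratios of counts already present in a CBC differential: the neutrophil-to-lymphocyte ratio and the platelet-to-lymphocyte ratio. A minimal sketch of their derivation is given below; the function name and example values are hypothetical, and all three counts are assumed to be absolute counts in the same unit (10^9/L).

```python
def inflammatory_ratios(neutrophils, lymphocytes, platelets):
    """Derive NLR and PLR from absolute CBC counts given in the same unit."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return {
        "NLR": neutrophils / lymphocytes,  # neutrophil-to-lymphocyte ratio
        "PLR": platelets / lymphocytes,    # platelet-to-lymphocyte ratio
    }

# Hypothetical CBC reading (10^9/L), not taken from the study.
print(inflammatory_ratios(neutrophils=4.0, lymphocytes=2.0, platelets=250.0))
```

Because both ratios reuse values from a test that is already being run, they add inflammatory information to the model at no additional laboratory cost.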
Moreover, other hematological counters in CBC, including RBCs, can provide insight into the state of glycation. Since RBCs are sensitive to changes in plasma composition, long-term hyperglycemia alters RBC physiology and associated indices 32,35,41. In the current study, a significant difference was observed in RBC count, MCV, and MCH (p-value < 0.01) between different glycemic states (Supplementary Tables S3 and S4). A negative trend was seen for MCV and MCH with the increase in HbA1c and FBG for both cohorts (Fig. 2E-G, L-N), which is in line with studies that report a negative correlation between HbA1c and MCV and MCH 42,43. Therefore, the employment of RBC-related indicators can also provide a useful reference for the diagnosis and prognosis of diabetes. Furthermore, several studies report that Hb variants in hemoglobinopathies and anemias interfere with the accurate measurement of HbA1c 44-46. In the current study, the superimposition of RBC indices on inflammation improved the discriminatory power of the model for three states of glycemia, hence proving to be a versatile tool, in parallel with HbA1c and FBG, for effective assessment of hyperglycemia.
Figure 4. Linear discriminant analysis (LDA) results for the joint predictive model of inflammatory markers in combination with erythrocyte status for estimation of dysglycemia. (A) HbA1c cohort: combined plot of the discriminant functions generated from 3 inflammatory parameters (NLR, PLR and WBC count), predictors of erythrocyte status (RBC count, MCV and MCH) and a glycemic status indicator (HbA1c or FBG). Each data point represents a single reading in the study sample. The plot illustrates close but distinctive clustering and separation of hyperglycemia (grey circles), borderline hyperglycemia (yellow squares) and normoglycemia (blue diamonds). The dark grey square represents the group centroid. The dashed dark blue line depicts the linear decision boundary. (C) Classification results for the back-substitution method for the HbA1c cohort, with 89.5% of original grouped cases correctly classified; jackknife cross-validation also shows 89.5% accuracy. (B) FBG cohort: combined plot of the discriminant functions generated from inflammatory and RBC predictors. Each data point represents a single reading in the study sample. The plot illustrates close but distinctive clustering and separation of hyperglycemia (grey circles), borderline hyperglycemia (yellow squares) and normoglycemia (blue diamonds). The dark grey square represents the group centroid. The dashed dark blue line depicts the linear decision boundary. (D) Classification results for the back-substitution method for the FBG cohort, with 82.8% of original grouped cases correctly classified; jackknife cross-validation shows 82.6% accuracy. Individual group classification is highlighted in bold. The model for HbA1c shows the highest accuracy in predicting hyperglycemia (97.1%), whereas the FBG model groups borderline hyperglycemia with the maximum accuracy of 94% of the original grouped cases, followed by normoglycemia with 92.6% accuracy.
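The RBC indices used in the joint model (MCV, MCH, MCHC) are themselves derived from three primary CBC measurements, which is why they shift when hemoglobin, hematocrit, or RBC count changes. A minimal sketch using the standard hematology definitions is shown below; these formulas are general laboratory conventions rather than something stated in the paper, and the function name and example reading are hypothetical.

```python
def rbc_indices(hb_g_dl, hct_pct, rbc_million_per_ul):
    """Compute red-cell indices from hemoglobin, hematocrit and RBC count.

    hb_g_dl: hemoglobin in g/dL
    hct_pct: hematocrit in %
    rbc_million_per_ul: RBC count in 10^6 cells per microlitre
    """
    if hct_pct <= 0 or rbc_million_per_ul <= 0:
        raise ValueError("hematocrit and RBC count must be positive")
    return {
        "MCV_fL": hct_pct * 10 / rbc_million_per_ul,   # mean corpuscular volume
        "MCH_pg": hb_g_dl * 10 / rbc_million_per_ul,   # mean corpuscular hemoglobin
        "MCHC_g_dL": hb_g_dl * 100 / hct_pct,          # mean corpuscular Hb concentration
    }

# Hypothetical reading, not taken from the study.
print(rbc_indices(hb_g_dl=15.0, hct_pct=45.0, rbc_million_per_ul=5.0))
```

Under these definitions, a rising RBC count with a stable Hb, the pattern reported across glycemic states in this study, necessarily lowers MCH, which is consistent with the negative trend described above.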
According to the molecular and inflammatory descriptors in the proposed models, each biomarker contributed to predicting different glycemic outcomes. In addition, we attempted to establish reliable LDA models which could reveal the underlying dimensionality of the data while specifying the contribution of each parameter to the glycemic status group classification. Taken together, the combined results from both the post-hoc univariate and multivariate analyses exhibited significant group separation in the multivariate space. This provided evidence that the selected parameters manifested multivariate characteristics, which made it imperative to employ a multivariate follow-up technique to decipher the latent dependence within the dataset towards formulating a glycemic status prediction model. This approach is also in line with prior literature 47-50, which highlights the follow-up employment of LDA, a post-hoc method of choice after MANOVA, for predictive modelling. Note that LDA served both as an interpretive technique and as a means of graphical representation of the classified data.
These models incorporated the influence of inflammation and derangement in various hematological parameters, as independent variables, in the prediction of three states of glycemia. Moreover, the discriminant functions based on inflammatory descriptors provided significant predictive power, but the integration of RBC indices enhanced the prediction accuracy by 8.3 percentage points (from 81.2 to 89.5%). The improved model shows the full valuation of HbA1c (%) being explained by six types of biomarkers, which consequently enhanced the discriminatory power of the model. The presented discriminant equations (Eqs. 7-9) for the models could be used as an in silico screening tool for the prediction of glycemic outcomes. Different models could be chosen according to the type and budget of the screening program (Table 4).
To compare the performance and utility of supervised learning methods, we also evaluated other ML techniques including Multinomial Logistic Regression (MLR), Multilayer Perceptron (MLP), and K-Nearest Neighbors (KNN). For MLR, the model demonstrated that one predictor (HbA1c or FBG) significantly outweighs the others in terms of coefficient magnitude. MLP, on the other hand, though it learned the complex nonlinear relationship between predictors and outcomes, was adversely affected by one input neuron (HbA1c or FBG), as revealed by feature importance analysis. The KNN model also exhibited limited efficacy in overcoming this issue. This underperformance can be attributed to several factors, which makes it essential to acknowledge that the choice between classification techniques is contingent not just on parametric and non-parametric assumptions but also on the characteristics of the dataset and the presence of multicollinearity 51 among variables. In line with this, existing literature suggests that LDA, under certain conditions and in comparison to other techniques such as logistic regression, multinomial logistic regression, random forests, support-vector machines, and the K-nearest neighbor algorithm, performs better in group membership prediction 52,53. Tharwat et al 54 conducted experiments with different datasets to investigate the effect of eigenvectors in LDA space on the robustness of the extracted features for classification accuracy. Likewise, Tao Li 55 suggested in their experimental investigation that LDA proves to be a fast and accurate option for multi-class classification problems. This study demonstrates that LDA modeling can aid in the cost-effective screening of hyperglycemia and borderline hyperglycemia using data from simple routine lab diagnostic tests. Further, in conjunction with normal examination procedures, this tool could assist in better diagnosis and management of diabetes.
All results significant at p-value < 0.001.
In terms of limitations, this study suffers from the unavailability of DM status amongst the hyperglycemic sub-groups. Also, the current study utilized clinical data of walk-in visitors to clinical diagnostic laboratories, which includes both patients and healthy individuals. Therefore, as a future extension, we propose to deploy the reported models in clinical settings where diagnosis information could be utilized for fine-tuning the models towards the development of higher-accuracy screening models.

Conclusion
The increasing burden of diabetes makes it imperative to investigate prophylactic interventions, in comparison to treatments, through early detection of diabetes. Reliance on FBG alone might result in under-reporting, as patients may be asymptomatic or adhering to strict dietary regimens, besides using medications before testing. Clinical diagnostics data can therefore be used to screen patients for detecting the early onset of diabetes, for onward investigations, and for disease management. The proposed model provides a low-cost platform with considerable accuracy for the detection of hyperglycemia, which in turn would have the capacity to improve quality of life by checking treatment costs and comorbidities. Together, the smart screening tool could assist in informing DM investigations and management, along with the prevention of clinical complications related to chronic hyperglycemia.

Table 2. Univariate tests of between-subject effects shown for the MANOVA models. Estimation of the individual ANOVA results for dependent variables (DVs) showing their dependency on the independent variable (IV). (A) & (B) show models that are fitted to contain data from five DVs (CRP, WBC, platelet, NLR and PLR), which are indicative of inflammation, and one IV, that is, glycemic level. (A) summarizes ANOVA results for the HbA1c cohort and (B) for the FBG cohort. (C) & (D) represent the model in which molecular markers (RBC, MCH, MCHC, MCV, HCT & Hb) are superimposed on inflammation (WBC, platelet, NLR and PLR), and the model is fitted to have 10 DVs and 1 IV (HbA1c or FBG). (C) summarizes the ANOVA results for the HbA1c cohort and (D) for the FBG cohort. The significant parameters and their respective p-values are highlighted in bold. *P < 0.05, **P < 0.01.

Figure 1. Graphical representation of multiple pairwise comparisons between glycemic levels for the HbA1c and FBG cohorts. HbA1c, FBG, CRP, WBC, NLR and PLR levels in the study sample are dependent on glycemic control. After a significant multivariate MANOVA model, we first assessed variations between groups of different levels of glycemia for HbA1c, FBG, CRP, WBC, NLR, PLR and platelets (Table 2A, B). The least significant difference (LSD) method was employed as a post hoc test to determine significant differences between groups and control the error rate at an α-level of 0.05. Graphs (A-F) represent pairwise analysis for the HbA1c cohort and graphs (G-L) represent pairwise results for the FBG cohort. The length of each bar represents the mean value, while 95% CIs are illustrated as error bars. Parameters (A-E) & (G-L) had a significant ANOVA result at p-value < 0.001, and significant results for multiple comparisons are shown with: Sig. (p): *p < 0.05, **p < 0.01, ***p < 0.001, # p > 0.05.

Figure 2. Graphical representation of multiple pairwise comparisons between glycemic levels for the HbA1c and FBG cohorts for the joint model. HbA1c, FBG, WBC, NLR, PLR, RBC, MCH, MCHC and MCV levels in the study sample are dependent on glycemic status. After a significant multivariate MANOVA model, we first assessed variations between groups of different levels of glycemia for HbA1c, FBG, CRP, WBC, NLR, PLR, platelets, RBC, MCH, MCHC and MCV (Table 2C, D). The least significant difference (LSD) method was employed as a post hoc test to determine significant differences between groups and control the error rate at an α-level of 0.05. Graphs (A-G) represent pairwise analysis for the HbA1c cohort and graphs (H-N) represent pairwise results for the FBG cohort. The length of each bar represents the mean value, while 95% CIs are illustrated as error bars. Parameters (A-N) had a significant ANOVA result at p-value < 0.001, and significant results for multiple comparisons are shown with: Sig. (p): *p < 0.05, **p < 0.01, ***p < 0.001, # p > 0.05.

Figure 3. Linear discriminant analysis (LDA) results for the predictive model consisting of inflammatory markers for estimation of dysglycemia. (A) HbA1c cohort: combined plot of the discriminant functions generated from 5 inflammatory parameters. Each data point represents a single reading in the study sample. The plot illustrates close but distinctive clustering and separation of hyperglycemia (grey circles), borderline hyperglycemia (yellow squares) and normoglycemia (blue diamonds). The dark grey square represents the group centroid. The dashed dark blue line depicts the linear decision boundary. (C) Classification results for the back-substitution method for the HbA1c cohort, with 81.2% of original grouped cases correctly classified; jackknife cross-validation shows 80.8% accuracy. (B) FBG cohort: combined plot of the discriminant functions generated from 5 inflammatory predictors. The dark grey square represents the group centroid. The dashed dark blue line depicts the linear decision boundary. (D) Classification results for the back-substitution method for the FBG cohort, with 82.3% of original grouped cases correctly classified; jackknife cross-validation shows 80.8% accuracy. Individual group classification is highlighted in bold. The model for HbA1c shows the highest accuracy in predicting borderline hyperglycemia, whereas the FBG model groups normoglycemia with the maximum accuracy of 93.4% of the original grouped cases, followed by borderline hyperglycemia with 84.6% accuracy.

RBC superimposed on inflammation status as an augmented discriminator of glycemic control
To investigate the effect of glycemic status on RBC status and inflammation, MANOVA was performed for two sub-cohorts, i.e., HbA1c (n = 28,577) and FBG (n = 8376), for 11 dependent variables including inflammatory markers (WBCs, NLR, PLR, Platelets) as well as molecular markers (RBC count, Hb (hemoglobin), HCT, and RBC indices including MCV, MCH, and MCHC). The group effect estimated from the multivariate test exhibits an F-ratio of 960.097 for HbA1c (p < 0.0001) and 286.112 for FBG (p < 0.0001), as indicated by Pillai's trace. To further analyze the effect of glycemia on molecular and inflammatory markers, a univariate one-way ANOVA test was employed for the HbA1c cohort. ANOVA revealed that there was a statistically significant difference
Scientific Reports | (2024) 14:1090 | https://doi.org/10.1038/s41598-023-44623-4

Table 3. Linear discriminant functions for the predictive modelling. Predictive model for chronic dysglycemia using inflammatory predictors: (A) classification function coefficients for the HbA1c cohort; (B) classification function coefficients for the FBG cohort. Joint predictive model for chronic dysglycemia using inflammatory markers in combination with molecular markers: (C) classification function coefficients of the joint model for the HbA1c cohort; (D) classification function coefficients of the joint model for the FBG cohort. Fisher's linear discriminant functions. a Model significant at p < 0.001. b Predictive model of non-specific inflammatory markers for estimation of chronic dysglycemia. c Joint predictive model of inflammatory markers in combination with erythrocyte status for estimation of chronic dysglycemia.